We use a recently developed coarse-grained model for DNA to study kissing complexes formed by hybridization of
complementary hairpin loops. The binding of the loops is topologically constrained because their linking number must
remain constant. By studying systems with linking numbers -1, 0 or 1 we show that the average number of interstrand
base pairs is larger when the topology is more favourable for the right-handed wrapping of strands around each other.
The thermodynamic stability of the kissing complex also decreases when the linking number changes from -1 to 0 to
1. The structures of the kissing complexes typically involve two intermolecular helices that coaxially stack with the
hairpin stems at a parallel four-way junction.



I. INTRODUCTION 

Not only is DNA the genetic information carrier of life, but,
given the degree of control achieved in the chemistry of DNA
(molecule synthesis is fast, reliable and relatively cheap),
these information-rich building blocks can be exploited to
reliably self-assemble two- and three-dimensional structures
and to build functional nanodevices.

Hairpins probably represent the simplest structure that
DNA can form besides the standard double helix. These
are secondary structure motifs formed by single-stranded
DNA molecules that have complementary regions that
self-hybridize. The intramolecular double helix formed from the
self-complementary sections is known as the stem or neck, while
the section that connects the two stem ends is called the loop.

In contrast to RNA, which for the most part is
single-stranded in vivo, so that hairpins are a common structural
element, DNA in vivo is mostly in its duplex form.
Nevertheless, there are occasions when it is single-stranded, and
examples have been identified where DNA hairpins play a
biological role, including in replication, transcription and
recombination. However, hairpin formation can sometimes
be an undesired process, and has been implicated in certain
diseases.

DNA hairpins also play an important role in DNA
nanotechnology, and can be used as the "fuel" to provide
the energy to power autonomous DNA motors, although they
can also be an unwanted secondary structural motif in DNA
designed to be unstructured.

In this paper we study "kissing" complexes that can form 
when the loops of two DNA hairpins are complementary and 
partially hybridize. In particular, we focus on the interplay 
between topology and the shape and stability of these com- 
plexes. For example, when two hairpin loops hybridize, the 
right-handed wrapping of the DNA strands in the intermolecular
double helix must be compensated by a region where the
loops wrap around each other in the opposite sense. Thus, 
even when the two loops are fully complementary, topological 
effects will restrict the number of bonds that can be formed. 

Kissing loop interactions are also an important RNA
tertiary structure motif and play key biological roles in
processes such as the regulatory action of antisense RNAs and
the dimerization of viral genomic RNA. They have therefore
been much better characterized for RNA, both in terms
of their structure and mechanical properties, than for
DNA. This structural knowledge has even been exploited in
structural RNA nanotechnology, where kissing loop
interactions have been used as a means to join RNA components
with a well-defined geometry. Although these RNA systems
provide an interesting comparison, the kissing loop
interactions typically involve shorter sequences of complementary
bases than for the DNA systems we consider here, and
so topological effects are less significant. Interestingly, the
NMR solution structure of a DNA kissing complex has been
obtained for sequences analogous to that of a previously
characterized RNA kissing complex. Although there are
differences in the details of the two structures, they are generally
very similar.

The topological effects associated with kissing loop
interactions have been exploited in DNA nanotechnology,
particularly to allow the design of autonomous motors. One
way of driving a DNA nanodevice through a cycle is through
the use of complementary single-stranded DNA strands as
"fuel", where the first strand is designed to partially hybridize
with the device to induce a conformational change, and the
second then reverses this change by displacing it to form a
"waste" duplex. The first example of such a device was DNA
nanotweezers, where the strands induced the device to open
and close. However, one problem with such a device is that the
two strands have to be added sequentially, since, if both are
present at the same time, they will preferentially hybridize
directly with each other rather than with the device.

One way to circumvent this problem is through the use of
fuel strands that can form hairpins, since the topological
restriction on binding between the hairpin loops will
effectively prevent the duplexes from being formed, even though
the duplexes are more stable. Given that the two strands are
complementary, these hairpins naturally form kissing
complexes. While the hairpins are unable to open each other's
stems by displacement, the motor can be designed to be a
catalyst for the hybridization. By having a single-stranded DNA
section that both is partially complementary to one of the
hairpins and has a free end, the motor can open the hairpin by
displacement, unconstrained by topological effects. A second,
and similar, solution is to prepare the fuel strands complexed
to partially complementary protective strands that bind to
either end of the fuels but leave a loop region in the middle
unhybridized. A variety of autonomous motors have been
designed based on these principles.

Here, we investigate the system of fully complementary
40-base DNA hairpins studied by Bois et al. using
computer simulations of a nucleotide-level coarse-grained model
of DNA. This recently introduced model provides an
excellent description of the structural, thermodynamic and
mechanical properties of both single-stranded and duplex DNA,
and has now made it feasible to study the free energy
landscapes of such DNA nanotechnology systems in detail, as
previously illustrated for DNA nanotweezers. In particular,
we focus on the effects of topology on the free energy land- 
scape for the binding of the hairpin loops, and how the struc- 
ture of the resulting kissing complex reflects these topological 
constraints. To further illustrate the role played by the topol- 
ogy, we also consider kissing complex formation in systems 
of linked hairpins. 



II. MODEL AND METHODS 



A. Model 



We use the coarse-grained DNA model developed by
Ouldridge et al. In this model, a DNA strand is described
as a polymer of nucleotides that interact via excluded
volume repulsion and anisotropic attractive potentials that mimic
Watson-Crick base-pairing, stacking, cross stacking and
coaxial stacking. The model has been parameterized to
reproduce the structural and thermodynamic properties of
single-stranded and double-stranded DNA molecules at the high salt
concentrations that are typically used in DNA nanotechnology
applications. Since this model is described in detail in the
work of Ouldridge et al., we shall repeat here only the
fundamental ingredients.

Each nucleotide is represented as a rigid body with three in- 
teraction sites, all on the same axis. Although the interaction 
sites are collinear, we stress that a nucleotide does not pos- 
sess cylindrical symmetry, since the potential also depends on 
a vector perpendicular to the nucleotide axis to capture the 
effects of the orientation of a base on the interactions. 



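The potential energy V can be written as:

$$
V = \sum_{\rm nn} \left( V_{\rm backbone} + V_{\rm stack} + V_{\rm exc} \right)
  + \sum_{\rm other\ pairs} \left( V_{\rm HB} + V_{\rm cross\ stack} + V_{\rm coax\ stack} + V_{\rm exc} \right)
\tag{1}
$$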



The first sum runs over all pairs of nucleotides that are ad- 
jacent along a strand (neighbours in our terminology) and 
the second sum runs over all other pairs. The interaction 
between neighbours consists of a backbone term that is de- 
signed to represent the connectivity of a DNA strand, a stack- 
ing term that is designed to mimic stacking interaction be- 
tween nucleotides, and an excluded volume part that pre- 
vents nucleotides from overlapping. The interaction between 
non-neighbouring pairs consists of four different terms: (i) 
a hydrogen-bonding term that mimics directional Watson- 
Crick base-pairing; (ii) a cross-stacking term that accounts 
for stacking interactions between nucleotides that are second 
neighbours on different strands; (iii) a coaxial stacking term 
that is designed to capture stacking interactions between
non-neighbouring bases; and (iv) an excluded volume term. The
full forms of these terms are reported in Ref. [34], with
the exception of the coaxial stacking term, which is described
separately. Its parametrization will be described in detail
elsewhere.
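
To make the two sums in Eq. (1) concrete, the following minimal Python sketch partitions all nucleotide pairs of a two-strand system into backbone neighbours and other pairs. The function name `partition_pairs` and the (strand, index) labelling are illustrative assumptions and are not taken from the model's published implementation.

```python
from itertools import combinations


def partition_pairs(strand_lengths):
    """Split all nucleotide pairs into neighbours and other pairs.

    Nucleotides are labelled (strand, index). Two nucleotides are
    neighbours if they are adjacent along the same strand.
    """
    nucleotides = [
        (s, i) for s, n in enumerate(strand_lengths) for i in range(n)
    ]
    neighbours, others = [], []
    for a, b in combinations(nucleotides, 2):
        same_strand = a[0] == b[0]
        adjacent = abs(a[1] - b[1]) == 1
        if same_strand and adjacent:
            neighbours.append((a, b))  # first sum in Eq. (1)
        else:
            others.append((a, b))      # second sum in Eq. (1)
    return neighbours, others


# Two 40-nucleotide strands, the strand length used in this work.
nn, other = partition_pairs([40, 40])
print(len(nn), len(other))
```

In a production code the second sum would of course be restricted to pairs within the interaction range; the sketch only illustrates which terms apply to which pairs.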

Features of the model that are particularly important for
the current study are the relative flexibility of single-stranded
DNA and its ability to describe the thermodynamics of
hairpin formation accurately, as well as hybridization in
general. We are also confident of the general robustness of
the model, based on the wide range of DNA systems on
which we and others have tested the model. These so
far include DNA nanotweezers, "burnt bridges" and
two-footed DNA walkers, as well as processes such as DNA
displacement, overstretching and cruciform formation,
and the formation of liquid crystalline phases.

However, we should also note that the model does introduce 
a significant level of coarse graining and neglects several fea- 
tures of the DNA structure and interactions. Firstly, all four 
nucleotides have the same structure and interaction proper- 
ties, except for the hydrogen-bonding term, for which interac- 
tions are only allowed between Watson-Crick complementary 
bases. Although this approximation of course precludes the 
study of much of the sequence dependence of properties and 
behaviour, it is not a problem when, as here, we are interested 
in the generic behaviour of a system. Secondly, the double 
helix in our model is symmetrical rather than having different 
sizes for the minor and major grooves. Again this is unlikely 
to be an issue, unless we are interested in the DNA structure 
at a quite fine level of detail. Finally, the interactions have 
been fitted for a single, fairly high, salt concentration (namely, 
0.5M), where the Debye screening length is short. This is the 
regime relevant to most DNA nanotechnology experiments. 

FIG. 1. (a) The definitions that we use for the signs associated
with the crossing of two curves. Schematic representations of (b)
the three topological configurations of DNA hairpins studied in
this paper and (c) a control system with no topological or geometric
constraints. The different topological configurations in (b) are: (i)
Topologically unlinked, linking number Lk = 0. (ii) Topologically
favoured, Lk = -1. (iii) Topologically frustrated, Lk = +1.



B. Simulation Methods 



Throughout this work, we use Monte Carlo simulations
employing the Virtual Move Monte Carlo (VMMC) algorithm
introduced by Whitelam and coworkers. This is a modification of
the standard (Metropolis) Monte Carlo algorithm specifically
designed to promote the collective diffusion of strongly
interacting clusters that would otherwise be suppressed. In our
model, DNA strands are effectively clusters of interacting
nucleotides, and VMMC significantly speeds up sampling,
particularly when strand diffusion is important, which is the
case when studying hybridization processes.

Because of the presence of large free-energy barriers, we
have used umbrella sampling to accurately sample
transitions between different states. In practice, this is
accomplished by adding an additional term to the system
Hamiltonian designed to flatten the free energy profile along a
particular reaction coordinate, and then subsequently unbiasing
the results. In the present case, the natural choice for the
reaction coordinate is the number of base pairs between the two
strands. This choice requires a definition of base pairing, and
we define a pair of nucleotides as base paired if the hydrogen
bonding interaction term between them is at least 0.093 times
the well depth. Of course this choice is somewhat arbitrary,
but changing the threshold does not significantly alter the
results.
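
The unbiasing step can be illustrated with a minimal Python sketch. It assumes that the bias enters as a multiplicative weight w(n) on states with n interstrand base pairs, so that the unbiased probability is proportional to the sampled histogram divided by w(n). The function names, the sign convention for the hydrogen-bonding energy, the histogram and the weights are all illustrative assumptions, not data or code from this work.

```python
import math


def is_base_paired(e_hb, well_depth, threshold=0.093):
    """Base-pair criterion described in the text.

    Assumes the hydrogen-bonding energy e_hb is negative when
    attractive, so a pair counts as formed when e_hb reaches at
    least `threshold` times the well depth.
    """
    return e_hb <= -threshold * abs(well_depth)


def unbiased_free_energy(biased_counts, weights):
    """Return F(n)/k_BT, shifted so that its minimum is zero.

    biased_counts[n]: visits to state n in the biased simulation.
    weights[n]: umbrella weight w(n) applied to state n.
    All counts must be positive for the logarithm to be defined.
    """
    unbiased = [c / w for c, w in zip(biased_counts, weights)]
    total = sum(unbiased)
    free_energy = [-math.log(u / total) for u in unbiased]
    f_min = min(free_energy)
    return [f - f_min for f in free_energy]


# Hypothetical example: a flat biased histogram obtained with weights
# that favour states which would otherwise be rarely visited.
counts = [1000, 1000, 1000]
weights = [1.0, math.exp(2.0), math.exp(4.0)]
profile = unbiased_free_energy(counts, weights)
```

A flat biased histogram is the ideal outcome of umbrella sampling: every value of the reaction coordinate is visited equally often, and the free energy differences are carried entirely by the weights.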



C. DNA Sequences 

We have studied the same 40-nucleotide sequences as
Bois et al. The two DNA strands are fully complementary,
and so can form a duplex as well as hairpins with a stem of 10 
base pairs and a loop of 20 bases. The sequences of the two 
strands are, in 5' to 3' direction: 

gcgttgctgc-attttactcttctcccctcg-gcagcaacgc 

and 

gcgttgctgc-cgaggggagaagagtaaaat-gcagcaacgc 

where the hyphens separate stem and loop regions. All the
results we present are at room temperature, taken as T =
296.15 K (23 °C). This compares to a hairpin melting
temperature of around 350 K. Hence, at the temperature we consider,
the probability of spontaneous hairpin opening is effectively
zero in our simulations.
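
The complementarity properties stated above can be checked directly from the sequences. The short Python sketch below is only an illustration (the helper `reverse_complement` is our own naming); it verifies the 10-base-pair stems, the 20-base loops, and the complementarity of the loops and of the full strands.

```python
COMPLEMENT = {"a": "t", "t": "a", "g": "c", "c": "g"}


def reverse_complement(seq):
    """Return the Watson-Crick reverse complement of a 5'-to-3' sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# Sequences from the text, 5' to 3'; hyphens separate stem and loop.
strand_1 = "gcgttgctgc-attttactcttctcccctcg-gcagcaacgc"
strand_2 = "gcgttgctgc-cgaggggagaagagtaaaat-gcagcaacgc"

stems_close = []
region_lengths = []
for strand in (strand_1, strand_2):
    stem_5, loop, stem_3 = strand.split("-")
    # The two stem arms must pair with each other to close the hairpin.
    stems_close.append(reverse_complement(stem_5) == stem_3)
    region_lengths.append((len(stem_5), len(loop), len(stem_3)))

loop_1 = strand_1.split("-")[1]
loop_2 = strand_2.split("-")[1]
# Complementary loops allow the kissing complex to form.
loops_complementary = reverse_complement(loop_1) == loop_2
# Fully complementary strands allow the full duplex to form.
strands_complementary = (
    reverse_complement(strand_1.replace("-", "")) == strand_2.replace("-", "")
)
print(stems_close, region_lengths, loops_complementary, strands_complementary)
```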



D. DNA Topology 

A commonly used property to characterise systems with
respect to their topology is the linking number, Lk, a
number that describes how two closed curves are linked in
three-dimensional space. Given the projection of two closed curves
onto any plane, a crossing is taken to be positive (negative)
if the upper curve can be superimposed onto the lower by
a counterclockwise (clockwise) rotation (see Fig. 1(a)).
According to this definition, each crossing in the right-handed
helix formed by dsDNA is negative, as the strands in the helix
are antiparallel. The linking number Lk is then defined as



$$
Lk = \frac{1}{2} \sum_i C_i
\tag{2}
$$



where the index i runs over the crossings, with C_i = +1 for
a positive crossing and C_i = -1 for a negative crossing. In
more intuitive terms, Lk is the number of times that each
curve wraps around the other. For a detailed discussion of
the linking number in the context of nucleic acids we refer the
reader to the cited literature.
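
As a small worked illustration of Eq. (2), the Python sketch below evaluates the linking number from a list of crossing signs. The function name is our own, and the crossing patterns are chosen to match the counts quoted in the Results for the two linked complexes (four negative and two positive crossings for Lk = -1; one negative and three positive crossings for Lk = +1).

```python
def linking_number(crossing_signs):
    """Linking number of two closed curves from their crossing signs.

    Implements Lk = (1/2) * sum_i C_i, where each C_i is +1 or -1
    for a crossing between the two curves in a planar projection.
    """
    if any(c not in (+1, -1) for c in crossing_signs):
        raise ValueError("each crossing sign must be +1 or -1")
    total = sum(crossing_signs)
    if total % 2 != 0:
        raise ValueError("two closed curves cross an even number of times")
    return total // 2


# Unlinked loops: negative crossings balanced by positive ones.
lk_unlinked = linking_number([-1, -1, +1, +1])
# Topologically favoured complex: four negative, two positive crossings.
lk_favoured = linking_number([-1] * 4 + [+1] * 2)
# Topologically frustrated complex: one negative, three positive crossings.
lk_frustrated = linking_number([-1] + [+1] * 3)
print(lk_unlinked, lk_favoured, lk_frustrated)
```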

Although we are considering a system of two unclosed
DNA strands, because the rate of hairpin opening is negligible
in our simulations at the temperature we consider, the
hairpin loops can be effectively considered as closed loops, and
so can exhibit topologically different states (Fig. 1(b)). The
most experimentally relevant state is that with Lk = 0, as the
likelihood that two hairpin loops would interlink during their
formation process is very low. Since linking number is
conserved, in this state any negative crossings of the two hairpin
loops due to hybridization of the complementary loops must
be compensated by positive crossings, i.e. sections where the
loops wind round each other in the opposite sense. This
topological effect will frustrate the hybridization process.




FIG. 2. Typical structures assumed by (a) a single hairpin and (b) 
the topologically unlinked (Lk = 0) kissing complex. Note that 
in (a) it almost looks as if the stem is longer than 10 base pairs, 
because the stacking tends to propagate beyond the end of the stem 
at this temperature. For (b) the chosen structure has 14 interstrand 
base pairs. To its right is a topological sketch of the configuration 
illustrating that the zero linking number is achieved by balancing 
positive and negative crossings. In panels (c)-(e) we show example 
structures for partially formed complexes with a total of 2, 6 and 10 
base pairs formed between the loops, respectively. 



III. RESULTS 

A. Topologically unlinked complex; Lk = 0

FIG. 3. Free energy profile of two complementary hairpins that are
topologically unlinked (i.e. Lk = 0) at T = 23 °C at a single strand
concentration of 0.336 mM (squares). The free energy profile for
hybridization of the control system (Fig. 1(c)) is also plotted
(circles). In the inset, the full profiles are plotted, showing the large
(> 30 k_BT) free energy difference between the most stable states of
the kissing complex and the control system. The free energy profiles
have been taken to have the same value when the number of base
pairs is 1, assuming the same value for the association barrier.

FIG. 4. Bonding probability as a function of nucleotide position in
the loop for different N_tot, the total number of correct base pairs in
the kissing complex. Note that the probability is not completely
symmetric around the centre of the loop sequence because the propensity
to form non-native base pairs depends on which native base pairs are
formed.

Firstly, we consider two hairpins that are topologically
unlinked (Fig. 1(b)(i)) and are free to bind through the
complementary loop regions. One of the unbound hairpins is
illustrated in Fig. 2(a). The room temperature free energy profile
for bonding is reported in Fig. 3. The jump associated with the
formation of the first bond is due to the loss of translational
entropy associated with hybridization, and is dependent on
concentration as well as temperature. Our data were collected
with two strands in a volume of 4944 nm^3, i.e. 0.336 mM.
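
The quoted concentration follows directly from the simulation volume, as the following minimal Python sketch shows. It assumes that the single strand concentration corresponds to one copy of each strand species in the box; the function name is illustrative.

```python
AVOGADRO = 6.02214076e23  # mol^-1


def molar_concentration(n_molecules, volume_nm3):
    """Concentration in mol/L of n_molecules in a volume given in nm^3."""
    litres = volume_nm3 * 1e-24  # 1 nm^3 = 1e-27 m^3 = 1e-24 L
    return n_molecules / (AVOGADRO * litres)


# One copy of each strand species in the 4944 nm^3 simulation volume.
c_single_strand = molar_concentration(1, 4944.0)
print(f"{c_single_strand * 1e3:.3f} mM")
```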

For the hybridization of a duplex in our model, we have
previously shown that, after the barrier for forming the first base
pair, there is then a linear decrease in the free energy as the
number of base pairs increases (aside from a possible small
rise at the end due to the fraying of the ends of the duplex).
The behaviour seen for the binding of the two hairpin loops
is significantly different from this scenario. After an initial
roughly linear decrease (up to about 3-4 base pairs), the line
begins to exhibit some positive curvature, reaching a minimum
at 14 base pairs (roughly one and a half helical turns) before
rising steeply. This curvature is a result of the topological
constraint that as the two hairpin loops wind around each other to
form a duplex the linking number must remain constant, and
this constraint is increasingly felt as the number of base pairs
increases.

To compare the thermodynamics to a system where
topological constraints do not play a role, we introduce a control
system where the hairpin loops have been opened by
breaking the backbones between bases 10 and 11 of the first strand
and 30 and 31 of the second strand, as shown in Fig. 1(c).
The free-energy profile of the control shows the expected
linear decrease in free energy for hybridization in the absence
of topological effects. It is also worth noting that this system
gains 30 k_BT more in free energy from hybridization than the
hairpins do from forming the kissing complex. This value is
roughly the amount of free energy stored in the metastable
kissing complex (the most stable state for this system is the
full duplex, since the two sequences are fully
complementary). In DNA nanotechnology systems where such hairpins
are used as fuels, this would be the amount of free energy that
is potentially available to do work.

It is also interesting to look at the structure of the
kissing complex as hybridization progresses. Fig. 2 shows
example structures with different numbers of base pairs, and in
Fig. 4 the probability that a given base is bound as a function
of its position along the loop is depicted for different
numbers of total base pairs. For kissing complexes with a few
base pairs there is not a strong thermodynamic preference for
binding at a particular position in the loops, hence the
distribution in Fig. 4 is roughly uniform. The exceptions are the
first and last bases in the loops, for which base-pairing is
disfavoured, presumably because the more crowded environment
that would result makes binding at these positions entropically
less favourable.

By contrast, for kissing complexes with the most favourable
number of base pairs (N_tot = 14), there is a very clear pattern
for bonding. The six bases closest to each stem are
invariably base-paired, while the four central bases have virtually
no probability of binding. Therefore, the configurations in the
ensemble with 14 base pairs are all very similar
to that in Fig. 2(b). We should also note that this structure
is very different from the typical schematics of kissing
complexes, which tend to assume a single hybridized region,
normally between the central regions of the loops.

The reason for this well defined pattern of bonding becomes
clear when we examine the structure in more detail. It is
simply the best way to maximise the base pairing whilst
satisfying the topological constraints. In particular, the base pairing
adjacent to each stem continues the two helices formed by
the stems (i.e. there is coaxial stacking) and there is a
parallel four-way junction at the coaxial stacking site associated
with the exchange of strands between the two helices.
Importantly, as the junction is parallel, the strand exchange leads to
a crossing with positive sign that helps to counterbalance the
negative crossings associated with each region of base pairing
between the loops (Fig. 2(a)). This positive crossing comes
with little free energy cost because it does not lead to any loss
of base pairing. By contrast, the second positive crossing near
to the centre of each hairpin loop is associated with a
reversal of the direction of wrapping of the two chains around each
other and is responsible for the inability of the two loops to
hybridize further without significant free energy cost.

The base pairing probability distributions for kissing
complexes with 6 and 10 base pairs in Fig. 4 illustrate how this
tendency to base pair at the extremes of the loops becomes more
pronounced as the number of base pairs is increased and the
system becomes more topologically constrained. However,
the ensembles of such structures are still much more diverse
than for the fully formed kissing complex. Both the examples
in Fig. 2(d) and (e) only have a single hybridized region, with
only the latter being adjacent to one of the hairpin stems.

The positive crossings of DNA strands associated with
parallel four-way junctions have previously been used to
offset the negative crossings associated with hybridization in
so-called "paranemic crossover" motifs. In this motif
hybridization occurs between bubbles (a series of unpaired
bases) in two duplexes, leading to parallel four-way junctions
at either end of the newly hybridized section (rather than at
just one end as for the hairpin loops). These junctions can
exactly offset half a turn in each of the helices that result, and
so can lead to complete hybridization between topologically
closed species. These paranemic crossover motifs have been
proposed as an alternative to "sticky" single-stranded ends as
a means for binding together different molecules in DNA
nanotechnology, and have been used in making DNA
triangles and octahedra. Such paranemic crossovers have also
been shown to form between negatively supercoiled
homologous duplexes because their zero linking number helps to
alleviate the supercoiling.

Interestingly, a very similar structure to that depicted in
Fig. 2(b) has been identified for the inhibitory complex
between the antisense RNA CopA and its target messenger RNA
CopT, and has also been suggested for other such
complexes. CopA and CopT have small hairpin loops that
associate to form an initial kissing complex, which then
progresses to form an "extended" kissing complex, where some
of the base pairs in the hairpin stems are lost in favour of
two intermolecular helices that coaxially stack with the rest
of the hairpin stems at a parallel four-way junction like that
in Fig. 2(a). This progression is dependent on the presence of
bulges in the hairpin stems that presumably aid the
transformation by destabilizing the stems. Intriguingly, the number
of base pairs in the two intermolecular helices is thought to
be 15, with six bases in each of the loops connecting the ends
of these helices, which is extremely similar to the detailed
structure of our kissing complexes.

Our results for the topologically unlinked system can be
compared to the experimental results of Refs. [13] and [14] on
the stability of these DNA kissing complexes. One of these
studies reported a high yield (nearly 100%) of kissing complexes
at a single strand concentration of 8 μM in a buffer of relatively
high salt concentration, while the other reported that kissing
complexes were not stable for their 21-base loop hairpins, but
at a significantly lower strand concentration (0.1 μM) and at
low salt. The relative probability Φ of the system being in a
bound state (one or more interstrand base pairs) compared to
an unbound state (no interstrand base pairs) can be inferred from
the data in Fig. 3. We find Φ ≈ 3100. Assuming high
dilution, this ratio can be extrapolated to a different simulation
volume v' simply by dividing Φ by the ratio v'/v, where v
is the original simulation volume. The relative probabilities
of bound and unbound states can then be related to the bulk
yields f_∞ as described in Ref. [61]. Extrapolating our results
to the 8 μM conditions of the first experiment, we get Φ = 73.7
and a bulk yield f_∞ = 0.89, which is consistent with the
experimental result that the kissing complexes were significantly
more stable than the unbound state. By contrast, extrapolating our
results to the 0.1 μM concentration used in the second experiment,
we get Φ = 0.92 and f_∞ = 0.37, indicating that the kissing
complexes are less stable than the unbound states, which is again
consistent with the experimental findings. This is especially so
when we take into account that our model is fitted to a much
higher salt concentration than that used in the second experiment,
and thus is expected to overestimate the bulk yield in this case,
since the electrostatic penalty for bringing two strands close
together would be larger at the experimental salt concentration.
It should also be pointed out that the sequences we study are
quite asymmetrical in GC content, but that our model does not
account for sequence-dependent effects. It is thus possible,
depending on temperature and salt concentration, that the resulting
bonding pattern is actually a single helix between the GC-rich
regions of the loops.
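
The extrapolation described above can be reproduced with a few lines of Python. The rescaling of Φ follows the text (division by v'/v, which equals the ratio of simulation to target concentration). The expression used for the bulk yield is not written out in the text; the sketch assumes the standard result for an equimolar two-species dimerization, f_∞ = a - sqrt(a^2 - 1) with a = 1 + 1/(2Φ), and all function names are illustrative.

```python
import math


def extrapolate_phi(phi, conc_sim, conc_target):
    """Rescale the bound/unbound ratio to another concentration.

    In the dilute limit phi is divided by v'/v, and the volume ratio
    v'/v equals conc_sim / conc_target.
    """
    return phi * conc_target / conc_sim


def bulk_yield(phi):
    """Bulk fraction of bound strands (assumed two-species dimer form).

    Uses f = a - sqrt(a^2 - 1) with a = 1 + 1/(2*phi); this formula is
    an assumption of this sketch, not quoted from the text.
    """
    a = 1.0 + 1.0 / (2.0 * phi)
    return a - math.sqrt(a * a - 1.0)


PHI_SIM = 3100.0     # bound/unbound ratio inferred from Fig. 3
CONC_SIM = 0.336e-3  # mol/L, single strand concentration in the simulation

for conc in (8e-6, 0.1e-6):  # mol/L, the two experimental concentrations
    phi = extrapolate_phi(PHI_SIM, CONC_SIM, conc)
    print(f"c = {conc:.1e} M: phi = {phi:.2f}, yield = {bulk_yield(phi):.2f}")
```

Under this assumed form, the Φ values quoted in the text give yields that round to the quoted 0.89 and 0.37; the small difference between the rescaled Φ at 8 μM and the quoted 73.7 reflects the rounding of Φ ≈ 3100.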



B. Topologically favoured complex; Lk = -1

FIG. 5. (a) Typical structure and topological sketch of the kissing
complex with Lk = -1 and (b) free energy profile associated with
the formation of the kissing complex, compared to that for the
control system in Fig. 1(c). To aid this comparison, the two free energy
profiles were set to have the same value at 1 base pair.



We next consider topologically linked hairpins. Although
they are less experimentally relevant than the unlinked case,
they further illustrate the effect of topology on
hybridization. First we consider hairpins with a linking
number of -1 (Fig. 1(b)(ii)). In this case the linkage has the same
sense as the wrapping in duplex DNA and we thus expect
hybridization of the two hairpin loops to be easier than for
the unlinked case. The typical structure of the resulting
kissing complex and the free energy profile for hybridization are
shown in Fig. 5. There is a much lower entropic cost for
initial binding as compared to the topologically unlinked system,
because the two strands are already constrained to be close to
each other due to the linkage. The free energy profile also
exhibits a much closer to linear decrease with the number of base
pairs formed than for the unlinked case, and has a minimum
at 17 base pairs.

The structure of the resulting kissing complex is quite
similar to that for the unlinked case in that it also has a parallel
four-way junction at the point where the two stems end. The
reason for this structure is again that the junction provides a
positive crossing of the strands without any base pairs being
lost. The typical structure assumed by the kissing complex is
effectively two parallel helices and a small (2-4 base pairs)
unbound region where the strands bend back on themselves.
The topological sketch in Fig. 5(a) shows how the two
positive crossings (at the four-way junction and at the end of the
helices) and the linking number of -1 allow the system to
have four negative crossings, which is topologically sufficient
to form two double helical turns. Thus, the fact that there is
still a free energy cost associated with the formation of the last
base pairs is due not so much to topological constraints as to
geometric constraints arising because the backbones have to bend
around to bridge the two helices.

One interesting feature of the structure shown in Fig. 5(a)
is that the stems of both hairpins are only nine base pairs in
length because the hairpin loops have displaced one base pair
from each stem. This then raises the question of whether the
four-way junction could migrate further and lead to the
opening of both hairpins. We note that there are a number of
features hindering the junction diffusion. Firstly, junction
migration is easiest when the junction adopts an "open"
configuration where there is no stacking across the junction, rather
than the parallel stacked configuration typical of the kissing
complex. Secondly, the junction migration is resisted by the
topology. If the total number of base pairs is to remain
constant during migration then the number of base pairs in the
duplex regions of the hybridized hairpin loops must increase.
However, as the linking number must also remain constant,
this also means that the unfavourable left-handed wrapping
of the unhybridized sections of the loops must increase. Our
simulations corroborate this picture. We observed that the
position of the crossover between the end of the two stems is
rather stable, although it occasionally did move one or two
base pairs down.

FIG. 6. (a) Typical structure and topological sketch of the kissing
complex with Lk = +1 and (b) free energy profile associated with
the formation of the kissing complex, compared to that for Lk = -1.



C. Topologically disfavoured complex; Lk = +1

Finally, we consider topologically linked hairpins with a
linking number of +1 (Fig. 1(b)(iii)). In this case, the
crossings associated with the linkage are of the opposite sign to that
for a duplex, and so hinder base pairing between the loops.
The effects of this topological frustration are clear from the
free energy profile in Fig. 6. Now, the most stable kissing
complex has only 8 base pairs between the hairpin loops, and
is only a few k_BT more stable than the unhybridized state.
Indeed, it is likely that for slightly shorter loops the topological
frustration would be sufficient to totally inhibit binding.

The effects of topology are underlined by the comparison
with the topologically favoured configuration with linking
number Lk = -1, for which a further 25 k_B T drop in free
energy is obtained on forming the kissing complex. Visual
inspection of the Lk = +1 structure in Fig. 6(a) indicates a much
more distorted structure compared to the previous cases. In this
configuration, there is only a single negative crossing (roughly 
enough for one half turn of the double helix), but three positive 
crossings associated with the strands wrapping in the wrong 
sense around each other. 
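
The crossing count quoted above can be checked against the linking number, since for two closed curves the linking number is half the sum of the signed crossings between them. The following is a minimal sketch of that arithmetic only. The function name `linking_number` is an illustrative choice and is not part of the simulation code, and the sketch assumes that the one negative and three positive crossings are the only crossings between the two loops in the projection.

```python
def linking_number(crossing_signs):
    """Return the linking number of two closed curves.

    crossing_signs: iterable of +1 / -1, one entry per crossing
    between the two curves in a planar projection (assumed complete).
    """
    total = sum(crossing_signs)
    # Two closed curves always cross each other an even number of times,
    # so the signed sum must be even.
    if total % 2 != 0:
        raise ValueError("signed crossing sum must be even for closed curves")
    return total // 2


# One negative crossing and three positive crossings, as described in the text.
print(linking_number([-1, +1, +1, +1]))
```

With these four crossings the signed sum is +2, which is consistent with the imposed linking number of +1.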



D. Role of the backbone excluded volume 

Here, we consider how the results for kissing complexes 
depend on our parametrization of the excluded volume inter- 
action between backbone sites. We do this firstly because this 
interaction term will play a key role in determining how easy it 
is for the DNA chains to wrap around each other in the wrong 
sense, and hence how many base pairs can be formed between 
hairpin loops. But, secondly, in our original parametrization 
of the model many of the properties to which we fitted are 
relatively insensitive to this interaction, and so it is possible 
that this parameter does not have its optimal value. For exam- 
ple, in the duplex state, in the high-salt concentration to which 
our DNA model is fitted, the backbone sites are too far away 
from each other for their mutual excluded volume to signifi- 
cantly affect duplex properties. Other properties such as the 
single-stranded persistence length played a greater role in its 
parametrization. The shape of the interaction between back- 
bone sites, modelled as a soft repulsion, is also a significant 
approximation especially when two strands are close together. 

We have therefore repeated the calculation of the free energy
profile for the complex with Lk = 0, but where we have
changed the amount of repulsion between backbones by increasing
the effective radius σ_bb of the coarse-grained backbone
site by up to 30%. As shown in Fig. 7, as the repulsion
is increased, the average number of base pairs in the
loop is diminished, and the free energy gain for association
is significantly lowered. Of course, this change induces a
large change in the yield of kissing complexes at this temperature.
Since our model's predictions for yields using the original
value of σ_bb are reasonably in line with the experimental
studies reported in Refs. 13 and 14, the original parameterization
appears to be robust. Moreover, that the detailed pattern
of base pairing is consistent with known structures for RNA 
kissing complexes^Srii further corroborates this conclusion. 
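
To indicate why a modest change in the free energy gain translates into a large change in yield, the sketch below evaluates the Boltzmann factor exp(-ΔΔF / k_B T) by which the ratio of bound to unbound statistical weights is multiplied when the free energy gain on association is reduced by ΔΔF. The values of ΔΔF used here are hypothetical and chosen only for illustration (they are not read from Fig. 7), and the function name `weight_change_factor` is an assumption rather than anything taken from the model code.

```python
import math


def weight_change_factor(ddf_in_kT):
    """Factor multiplying the bound/unbound weight ratio when the free
    energy gain on binding is reduced by ddf_in_kT (in units of k_B T)."""
    return math.exp(-ddf_in_kT)


# Hypothetical reductions in the free energy gain, in units of k_B T.
for ddf in (1.0, 3.0, 5.0):
    factor = weight_change_factor(ddf)
    print(f"reduction of {ddf:.0f} k_B T -> weight ratio multiplied by {factor:.3g}")
```

Because the dependence is exponential, a reduction of only a few k_B T already changes the relative weight of the bound state by more than an order of magnitude.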

We also note that at the largest value studied for the range of 
the repulsion the typical structure of the complex has a single 
intermolecular helix. Since one could regard the increase in 
the range of the repulsion as a very crude way to extrapolate 
the predictions of the model to a lower salt concentration than 
the one at which it was parameterized, it is possible that under 
those conditions binding between the hairpins' loops involves 
a single intermolecular helix between the GC-rich regions.



IV. CONCLUSIONS 

Our simulations of the systems of kissing hairpins considered
experimentally by Bois et al. (Ref. 13) using a recently introduced
coarse-grained DNA model clearly illustrate the effects
of topology (due to the constraint that the linking number must
remain constant) on the free energy landscape for the formation
of a kissing complex. For the unlinked case, this topological
frustration leads to 30% of the bases being unpaired,
and the binding free energy of the kissing complex is significantly
smaller than for a fully formed duplex with equal strand
length (it is equivalent to the binding of a duplex with about 7
or 8 base pairs).

FIG. 7. Effect of the backbone excluded volume on the free energy
profiles for a kissing complex with Lk = 0. The original value of
σ_bb is 0.596 nm.



The free energy landscapes of the linked hairpins dramatically
illustrate how manipulation of the linking number can
increase or decrease this topological frustration. Compared to
the unlinked system, the number of base pairs in the most stable
kissing complex increases by 3 in the topologically more
favourable case (Lk = -1), but decreases by 6 in the topologically
less favourable case (Lk = +1). Even though the two
sequences are fully complementary, the topological frustra- 
tion also prevents the stems being opened by a displacement 
reaction involving the propagation of the intermolecular he- 
lices formed between the loops. It is this inhibition of duplex 
hybridization that underlies the use of hairpins as fuel for au- 
tonomous DNA nanodevices. 

The structure of the kissing complex is also of particular in- 
terest, as this information is not straightforward to extract ex- 
perimentally. We found that the kissing complex has a some- 
what unusual structure. In particular, rather than having a sin- 
gle hybridized region between the loops, as might have been 
anticipated, the kissing complex involves two intermolecular 
helices that coaxially stack with the hairpin stems, and in- 
volves a parallel four-way junction. This structure is favoured 
because there is a positive crossing of the strands at the junc- 
tion that helps to offset the negative crossings associated with 
hybridization, but without any loss of base pairing. By con- 
trast, the positive crossing nearer the centre of the loops is 
associated with unhybridized bases. For similar topological 
reasons, parallel four-way junctions have also been observed 
for paranemic motifs in which bulges in two separate duplexes 
cross hybridize. Furthermore, a structure very similar to that 
reported here has also been identified for an "extended" kissing
complex between messenger and antisense RNA that also
involves parallel helices and a four-way junction (Refs. 57-60).

As with any study with a coarse-grained model, one needs 
to consider how robust the results are and whether they might 
reflect any weaknesses of the model. We explicitly checked 
this for the repulsion between backbone sites in Section III D
and found that although the results could change significantly
when varying this parameter, our current value appears to be
physically most reasonable. In particular, the thermodynamics
in our model is consistent with the experimental stability
of kissing complexes for 20-base hairpin loops in Refs. 13 and
14 (after taking into account differences in DNA and salt
concentration). Furthermore, that similar structures are seen for
systems with similar topological constraints suggests that our 
findings are physically robust. 

One of the approximations in the model that we should 
particularly consider is the "average base" approximation, 
namely that the bases in our model have identical interaction
properties, except that hydrogen bonding can only occur
between Watson-Crick base pairs. Although the G-C content 
of the hairpin loops is close to half, 7 of those G-C base pairs 
occur in one half of the loop. The consequences of this spe- 
cific sequence might be to make the kissing complex asym- 
metrical with a longer intermolecular helix associated with the 
G-C rich half, or even to lead to the total loss of the second 
more weakly bound helical section. In this regard, it is
interesting to note that the catalyst strand used by Bois et al.
to open the kissing complex binds to that half of one of the
hairpins that is more weakly bound in the kissing complex (Ref. 13).

Although different topological states are only strictly well- 
defined for closed-loop molecules, as we have shown here, 
topological effects can be significant for linear DNA due to 
long-lived secondary structure that leads to the formation of 
internal loops. These topological constraints can inhibit
hybridization and prevent the system reaching the lowest
free-energy state. DNA nanotechnology takes advantage of these
effects when using DNA hairpins as fuel for autonomous motors,
but they could also potentially be an obstacle to the successful
self-assembly of DNA nanostructures. For example,
if a strand hybridizes at its two ends to parts of a second
long strand, an internal loop results that will be potentially
restricted in its binding by topological effects, unless one of
the already hybridized ends unbinds, either due to melting or
displacement. Therefore, the longer the strands involved in
a structure, the more likely that topological constraints will
have a significant effect on the ability of the system to
self-assemble. This argument suggests that for DNA origami (Ref. 63) the
shortness of the staple strands (typically having two or three
binding domains) probably has the effect of reducing the potential
for topological effects to hinder self-assembly. Furthermore,
the excess of staple strands means that a topologically
constrained bound strand can be displaced by one of the equivalent
staple strands from the reservoir in solution.



V. ACKNOWLEDGMENTS 

We would like to thank the Engineering and Physical Sci- 
ences Research Council for financial support. 

1. R. Carlson, Nature Biotechnol. 27, 1091 (2009).
2. N. C. Seeman, J. Theor. Biol. 99, 237 (1982).
3. N. C. Seeman, Nature 421, 427 (2003).
4. N. C. Seeman, Annu. Rev. Biochem. 79, 65 (2010).
5. J. Bath and A. J. Turberfield, Nat. Nanotechnol. 2, 275 (2007).
6. G. Varani, Annu. Rev. Biophys. Biomol. Struct. 24, 379 (1995).
7. D. Bikard, C. Loot, Z. Baharoglu, and D. Mazel, Microbiol. Mol. Biol. Rev. 74, 570 (2010).
8. M. A. Glucksmann-Kuis, X. Dai, P. Markiewicz, and L. Rothman-Denes, Cell 84, 147 (1996).
9. M. A. Oettinger, Nature 432, 960 (2004).
10. S. V. Santhana Mariappan, A. E. Garcia, and G. Gupta, Nucl. Acids Res. 24, 775 (1996).
11. S. L. Lam, F. Wu, H. Yang, and L. M. Chi, Nucl. Acids Res. 39, 6260 (2011).
12. R. M. Dirks and N. A. Pierce, Proc. Natl. Acad. Sci. USA 101, 15275 (2004).
13. J. S. Bois, S. Venkataraman, H. M. T. Choi, A. J. Spakowitz, Z.-G. Wang, and N. A. Pierce, Nucl. Acids Res. 33, 4090 (2005).
14. S. J. Green, D. Lubrich, and A. J. Turberfield, Biophys. J. 91, 2966 (2006).
15. S. Venkataraman, R. M. Dirks, P. W. K. Rothemund, E. Winfree, and N. A. Pierce, Nat. Nanotechnol. 2, 490 (2007).
16. S. J. Green, J. Bath, and A. J. Turberfield, Phys. Rev. Lett. 101, 238101 (2008).
17. P. Yin, H. M. Choi, C. R. Calvert, and N. A. Pierce, Nature 451, 318 (2008).
18. R. A. Muscat, J. Bath, and A. J. Turberfield, Nano Lett. 11, 982 (2011).
19. T. E. Ouldridge, A. A. Louis, and J. P. K. Doye, Phys. Rev. Lett. 104, 178101 (2010).
20. I. Tinoco Jr. and C. Bustamante, J. Mol. Biol. 293, 271 (1999).
21. C. Brunel, R. Marquet, P. Romby, and C. Ehresmann, Biochimie 84, 925 (2002).
22. E. Bindewald, R. Hayes, Y. G. Yingling, W. Kasprzak, and B. A. Shapiro, Nucl. Acids Res. 36, D392 (2008).
23. A. J. Lee and D. M. Crothers, Structure 6, 993 (1998).
24. K. Reblova, N. Spackova, J. E. Sponer, J. Koca, and J. Sponer, Nucl. Acids Res. 31, 6942 (2003).
25. P. T. X. Li, C. Bustamente, and I. Tinoco Jr., Proc. Natl. Acad. Sci. USA 103, 15847 (2006).
26. P. T. X. Li and I. Tinoco Jr., J. Mol. Biol. 386, 1343 (2009).
27. S. Horiya, X. Li, G. Kawai, R. Saito, A. Katoh, K. Kobayashi, and K. Harada, Chem. Biol. 10, 645 (2003).
28. I. Severcan, C. Geary, A. Chworos, N. Voss, E. Jacovetty, and L. Jaeger, Nature Chem. 2, 772 (2010).
29. W. W. Grabow, P. Zakrevsky, K. A. Afonin, A. Chworos, B. A. Shapiro, and L. Jaeger, Nano Lett. 11, 878 (2011).
30. F. Barbault, T. Huynh-Dinh, J. Paoletti, and G. Lancelot, J. Biomol. Struct. Dyn. 19, 649 (2002).
31. A. J. Turberfield, J. C. Mitchell, B. Yurke, A. P. Mills, M. I. Blakey, and F. C. Simmel, Phys. Rev. Lett. 90, 118102 (2003).
32. G. Seelig, B. Yurke, and E. Winfree, J. Am. Chem. Soc. 128, 12211 (2006).
33. B. Yurke, A. J. Turberfield, A. P. Mills, F. C. Simmel, and J. Neumann, Nature 406, 605 (2000).
34. T. E. Ouldridge, A. A. Louis, and J. P. K. Doye, J. Chem. Phys. 134, 085101 (2011).



35. T. E. Ouldridge, Coarse-grained modelling of DNA and DNA self-assembly, Ph.D. thesis, University of Oxford (2011).
36. T. E. Ouldridge, A. A. Louis, and J. P. K. Doye, in preparation.
37. C. De Michele, L. Rovigatti, T. Bellini, and F. Sciortino, in preparation (2012).
38. J. Bath, S. J. Green, and A. J. Turberfield, Angew. Chem. Int. Ed. 117, 4432 (2005).
39. J. Bath, S. J. Green, K. E. Allan, and A. J. Turberfield, Small 5, 1513 (2009).
40. D. Zhang and E. Winfree, J. Am. Chem. Soc. 131, 17303 (2009).
41. S. B. Smith, Y. Cui, and C. Bustamente, Nature 271, 795 (1996).
42. T. Ramredday, R. Sachidanandam, and T. R. Strick, Nucl. Acids Res. 39, 4275 (2011).
43. M. Nakata, G. Zanchetta, B. D. Chapman, C. D. Jones, J. O. Cross, R. Pindak, T. Bellini, and N. A. Clark, Science 318, 1276 (2007).
44. S. Whitelam and P. L. Geissler, J. Chem. Phys. 127, 154101 (2007).
45. S. Whitelam, E. H. Feng, M. F. Hagan, and P. L. Geissler, Soft Matter 5, 1521 (2009).
46. G. M. Torrie and J. P. Valleau, J. Comp. Phys. 23, 187 (1977).
47. S. Kumar, J. M. Rosenberg, D. Bouzida, R. H. Swendsen, and P. A. Kollman, J. Comput. Chem. 13, 1011 (1992).
48. Sometimes an alternative convention is used where the strands in a DNA duplex are considered to be parallel, so that positive linking numbers result (Ref. 49).
49. A. D. Bates and A. Maxwell, DNA Topology (Oxford University Press, 2005).
50. D. M. J. Lilley, Quart. Rev. Biophys. 33, 109 (2000).
51. X. Zhang, H. Yan, Z. Shen, and N. C. Seeman, J. Am. Chem. Soc. 124, 12940 (2002).
52. N. C. Seeman, Nano Letters 1, 22 (2001).
53. Z. Shen, H. Yan, T. Wang, and N. C. Seeman, J. Am. Chem. Soc. 126, 1666 (2004).
54. W. Liu, X. Wang, T. Wang, R. Sha, and N. C. Seeman, Nano Lett. 8, 317 (2009).
55. W. M. Shih, J. D. Quispe, and G. F. Joyce, Nature 427, 618 (2004).
56. X. Wang, X. Zhang, C. Mao, and N. C. Seeman, Proc. Natl. Acad. Sci. USA 107, 12547 (2010).
57. F. A. Kolb, C. Malmgren, E. Westhof, C. Ehresmann, B. Ehresmann, E. G. H. Wagner, and P. Romby, RNA 6, 311 (2000).
58. F. A. Kolb, H. M. Engdahl, J. G. Slagter-Jager, B. Ehresmann, C. Ehresmann, E. Westhof, E. G. H. Wagner, and P. Romby, EMBO J. 19, 5905 (2000).
59. F. A. Kolb, E. Westhof, C. Ehresmann, B. Ehresmann, E. G. H. Wagner, and P. Romby, Nucl. Acids Res. 29, 3145 (2001).
60. F. A. Kolb, E. Westhof, B. Ehresmann, C. Ehresmann, E. G. H. Wagner, and P. Romby, J. Mol. Biol. 309, 605 (2001).
61. T. E. Ouldridge, A. A. Louis, and J. P. K. Doye, J. Phys.: Condens. Matter 22, 104102 (2010).
62. I. G. Panyutin and P. Hsieh, Proc. Natl. Acad. Sci. USA 91, 2021 (1994).
63. P. W. K. Rothemund, Nature 440, 297 (2006).